Boosting with Multi-Way Branching in Decision Trees
Abstract
It is known that decision tree learning can be viewed as a form of boosting. However, existing boosting theorems for decision tree learning allow only binary-branching trees, and the generalization to multi-branching trees is not immediate. Practical decision tree algorithms, such as CART and C4.5, implement a trade-off between the number of branches and the improvement in tree quality as measured by an index function. Here we give a boosting justification for a particular quantitative trade-off curve. Our main theorem states, in essence, that if we require an improvement proportional to the log of the number of branches, then top-down greedy construction of decision trees remains an effective boosting algorithm.
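The trade-off described above can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the entropy index function, the proportionality constant `c`, and the helper names `best_split` and `weighted_impurity` are all illustrative assumptions. It shows a greedy split-selection step that accepts a k-way split only if its impurity reduction exceeds a threshold proportional to log2(k), so wider splits must earn their extra branches.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (an example index function) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def weighted_impurity(children):
    """Average entropy of child nodes, weighted by their sizes."""
    total = sum(len(labels) for labels in children)
    return sum(len(labels) / total * entropy(labels) for labels in children)

def best_split(candidate_splits, labels, c=0.1):
    """Pick the split whose gain best exceeds c * log2(number of branches).

    candidate_splits maps a split name to a list of child label lists
    (one per branch). Requiring gain >= c * log2(k) is a hypothetical
    instantiation of the log-of-branches trade-off curve.
    """
    parent = entropy(labels)
    best, best_margin = None, 0.0
    for name, children in candidate_splits.items():
        k = len(children)
        gain = parent - weighted_impurity(children)
        required = c * math.log2(k) if k > 1 else 0.0
        margin = gain - required  # how far the split clears its threshold
        if gain >= required and margin > best_margin:
            best, best_margin = name, margin
    return best
```

Under this criterion, a four-way split must show twice the improvement demanded of a binary split before it is preferred, which mirrors the trade-off CART-style index functions implement heuristically.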
Similar Papers
On the Practice of Branching Program Boosting
Branching programs are a generalization of decision trees. From the viewpoint of boosting theory, the former appear to be exponentially more efficient. However, earlier experience demonstrates that such results do not necessarily translate to practical success. In this paper we develop a practical version of Mansour and McAllester's [13] algorithm for branching program boosting. We test the algorit...
Random Ordinality Ensembles: A Novel Ensemble Method for Multi-valued Categorical Data
Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have little or no statistical support. In this paper, we propose a new ensemble method, Random Ordinality Ensembles (ROE), that circumvents this problem, and provides significantly improved accuracies over other popular ensemble met...
Boosting Using Branching Programs
It is known that decision tree learning can be viewed as a form of boosting. Given a weak learning hypothesis, one can show that the training error of a decision tree declines as |T|^(-b), where |T| is the size of the decision tree and b is a constant determined by the weak learning hypothesis. Here we consider the case of decision DAGs, i.e., decision trees in which a given node can be shared by different b...
The Difficulty of Reduced Error Pruning of Leveled Branching
Induction of decision trees is one of the most successful approaches to supervised machine learning. Branching programs are a generalization of decision trees and, by the boosting analysis, exponentially more efficiently learnable than decision trees. In experiments this advantage has not been seen to materialize. Decision trees are easy to simplify using pruning. For branching programs no such a...